Adaptive Read Pre-Fetch




FIELD OF THE INVENTION 
This invention relates to computers and to read pre-fetching.



BACKGROUND OF THE INVENTION 
Existing PCI bridges assist in the control of the sequencing
of operations and access to computer busses in accordance
with the bus specification (such as, for example, PCI Local
Bus Specification Rev. 2 published by the PCI Special Interest
Group). Pre-fetch algorithms are not covered by the PCI
specification, but are widely employed by PCI devices to
circumvent a fundamental issue with the PCI protocol: it does
not include a read amount embedded within each transaction.
Such devices employ a static read pre-fetch which requests the
same amount of information for a particular type of read
operation, regardless of the actual demands of the requesting
agent. While this constant pre-fetch amount may be adjustable
by means of a device specific configuration register, the
selected amount is constant and applicable to all requesting
agents served in connection with that register. A static
pre-fetch amount may result in pre-fetching too much data.

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 is a block diagram of an adaptive read pre-fetch 
system. 

FIG. 2 is a flow chart of an adaptive read pre-fetch method.
DETAILED DESCRIPTION
Referring to FIG. 1, an example adaptive read pre-fetch
system 10 is shown having components on a bridge 12. The
components include a pre-fetch factor register 15, being a
re-writeable storage location. The adaptive read pre-fetch
system 10 also includes a re-read pre-fetch factor register
20, a re-read timer 25 and a next read address register 30.
Also shown is pre-fetchable data storage such as system memory
40, and agents 35a, 35b and 35c. Each of the components of the
adaptive read pre-fetch system 10 is preferably part of or
attached to the computer, such as a bridge 12, within which
the pre-fetch factor register 15, the re-read pre-fetch factor
register 20, the re-read timer 25, and the next read address
register 30 may, but need not, reside. Also shown in FIG. 1 is
a CPU 42 which communicates through a host bridge 44 with a
PCI primary bus 46. Bridge 12 is also capable of communicating
with the primary bus 46.
An agent 35a, 35b or 35c may be any requesting agent, 
such as an agent on a PCI 2.2 secondary bus 30 connected to a 
bridge 12. An agent may be any of a number of devices capable 
of requesting a memory read operation on the bus. 

At a set time, typically upon system reset, the values in 
the pre-fetch factor register 15 and re-read pre-fetch factor 
register 20 are initialized. 

When an agent on the bus 30 requests a memory read
operation, it notifies bridge 12 of the request by asserting
the appropriate signals on the bus 30. If the bridge 12
determines that the request is from pre-fetchable storage 40,
it multiplies a pre-defined amount of data requested by the
number held in the pre-fetch factor register 15. The amount of
data to be read depends upon the type of read request as well
as the particular system design, for example the size of a
cache line. Table 1 shows the data amounts for three types of
request. PFFR is the pre-fetch factor register value.




Memory Operation    Alignment       Read Size
Read                DWORD           (PFFR+1)*4*DWORD
Read Line           cacheline       (PFFR+1)*cacheline
Read Multiple       2 cachelines    (PFFR+1)*2 cachelines

Table 1



A cacheline is a series of contiguous bytes of data 
corresponding to the host CPU's cache subsystem. Cachelines 
conform to CPU dependent address alignment. A DWORD is a 
double word, with a length that depends upon the particular 
computer memory configuration. Read operations may be limited 
to cacheline boundaries. Factor is the value contained in the 
pre-fetch factor register, and may be altered during operation 
of the computer by software. 
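
The Table 1 arithmetic can be sketched as a small function. This is a minimal illustration, not the claimed implementation: the byte sizes for a DWORD and a cacheline, the operation names, and the function name prefetch_size are all assumptions chosen for the example, since the text leaves those values system dependent.

```python
# Illustrative only: the byte sizes below are assumptions, not values from the text.
DWORD_BYTES = 4        # assumed DWORD length
CACHELINE_BYTES = 32   # assumed cacheline length

# Base amount for each read type before scaling, following Table 1.
BASE_AMOUNT = {
    "read": 4 * DWORD_BYTES,
    "read_line": CACHELINE_BYTES,
    "read_multiple": 2 * CACHELINE_BYTES,
}


def prefetch_size(operation: str, pffr: int) -> int:
    """Return the pre-fetch amount in bytes: (PFFR + 1) * base amount."""
    if pffr < 0:
        raise ValueError("pre-fetch factor must be non-negative")
    return (pffr + 1) * BASE_AMOUNT[operation]
```

With a factor of zero the function returns the unscaled base amount, so the bridge behaves like one with a static pre-fetch.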
Referring to FIG. 2, a flow chart of an adaptive read
pre-fetch method 100 is shown. At system initialization 105,
an initial value for the pre-fetch factor register is set.
This may be in system ROM, or may be set (and changed from
time to time) as a parameter in the operating system or any
other system or application software. In one embodiment, a
pre-fetch timer may be initialized to a set time, which will
decrement to zero unless reset.

If an agent gives a pre-fetchable read request 110 (of
whatever type) then the read amount, based upon the type of
read (see Table 1), is multiplied by the pre-fetch factor
plus one, the pre-fetch factor being stored in the pre-fetch
factor register 15. Thus, if the value of the pre-fetch
factor register is zero, the read amount is multiplied by one,
disabling the feature.
The value in the next read address register 30 is compared to
the value of the read address received from the agent. If they
are the same (meaning that the value in the next read address
was stored as a result of a prior read request from the same
agent which was terminated early for some reason, such as
being disconnected by the bridge for lack of data), then the
read amount is again increased. The read amount is multiplied
by one plus the value in the re-read pre-fetch factor register
20. Other implementations could successively automatically
increase the value in the re-read pre-fetch factor register
for each early terminated read, and conversely could
periodically decrement the re-read pre-fetch factor.
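
The alternative just mentioned, in which the re-read factor grows with each early terminated read and decays periodically, might look like the following sketch. The step size of one, the upper bound, and the class and method names are hypothetical; the text does not specify them.

```python
class RereadFactor:
    """Hypothetical self-adjusting re-read pre-fetch factor."""

    def __init__(self, initial: int = 0, maximum: int = 7) -> None:
        # `maximum` is an assumed register limit, not a value from the text.
        self.maximum = maximum
        self.value = min(max(initial, 0), maximum)

    def on_early_termination(self) -> None:
        """Raise the factor by one after an early terminated read."""
        self.value = min(self.value + 1, self.maximum)

    def on_period_elapsed(self) -> None:
        """Lower the factor by one each time the decay period elapses."""
        self.value = max(self.value - 1, 0)
```

Clamping at both ends keeps the factor inside the assumed register range however many events arrive in a row.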

If the address in the read request does not match the
value in the next read address register 125, the value in the
re-read pre-fetch register is ignored. In either case, a read
of the calculated pre-fetch amount is attempted 135.

Table 2 shows the read size for different memory 
operations using the re-read pre-fetch register (RRPFR) value: 



Memory Operation    Alignment       Read Size
Read                DWORD           (PFFR+1+RRPFR)*4*DWORD
Read Line           cacheline       (PFFR+1+RRPFR)*cacheline
Read Multiple       2 cachelines    (PFFR+1+RRPFR)*2 cachelines

Table 2
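
The address comparison and the Table 2 sizes can be combined in one short sketch. It follows the additive form shown in Table 2, where RRPFR is added to PFFR+1 only when the request address matches the saved next read address. The byte sizes, the operation names, and the function name adaptive_read_size are assumptions made for illustration.

```python
from typing import Optional

# Assumed sizes for illustration; real values are system dependent.
DWORD_BYTES = 4
CACHELINE_BYTES = 32

BASE_AMOUNT = {
    "read": 4 * DWORD_BYTES,
    "read_line": CACHELINE_BYTES,
    "read_multiple": 2 * CACHELINE_BYTES,
}


def adaptive_read_size(
    operation: str,
    address: int,
    pffr: int,
    rrpfr: int,
    next_read_address: Optional[int],
) -> int:
    """Return the read size in bytes, adding RRPFR only on a re-read."""
    multiplier = pffr + 1
    # A match means the previous read ended early at this address.
    if next_read_address is not None and address == next_read_address:
        multiplier += rrpfr
    return multiplier * BASE_AMOUNT[operation]
```

When no address has been saved, or the addresses differ, the result falls back to the Table 1 amount.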

If the read terminates early, then the requesting agent
has not received all of the data that it presumably wants.
Early termination occurs if the bridge disconnects the read
transaction because data is exhausted and the requesting
device is still expecting additional data (i.e. still
asserting the PCI bus signal FRAME#). Data may become
exhausted because of a variety of reasons, including an end of
file, exhaustion of a buffer or other reasons.
In the case of an early termination, the adaptive read
pre-fetch process increases the amount of data retrieved by
the next read request at the same location (where the current
read ended) from the requesting agent. This is accomplished by
saving the next read address (the next address at which data
would have been retrieved had the read not been terminated
early) and beginning to use the re-read factor and the type
specific pre-fetch amounts.
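
The bookkeeping around early termination can be sketched as a small tracker. The class and method names are hypothetical, and clearing the saved address after a read that completes normally is an assumption; the text describes only the saving step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RereadTracker:
    """Hypothetical model of the next read address register."""

    next_read_address: Optional[int] = None

    def finish_read(self, start: int, delivered: int, frame_asserted: bool) -> None:
        """Record how a read that began at `start` ended.

        `delivered` is the number of bytes handed to the agent.
        `frame_asserted` is True when the agent still wanted data
        (FRAME# asserted) at the moment the bridge ran out.
        """
        if frame_asserted:
            # Early termination: remember where the agent would resume.
            self.next_read_address = start + delivered
        else:
            # Assumption: a normally completed read clears the saved address.
            self.next_read_address = None

    def is_reread(self, address: int) -> bool:
        """Return True if `address` resumes an early terminated read."""
        return self.next_read_address is not None and address == self.next_read_address
```

A request for which is_reread returns True is the case in which the re-read pre-fetch factor would be applied.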

The smart pre-fetch ability may be disabled by programming
that is accessible during system initialization and by the
operating system as a parameter. A separate process may be
implemented for each agent on a secondary bus and may also be
implemented in the primary bus side as well as the secondary
bus.

The invention has been described in terms of particular
embodiments. Other embodiments are within the scope of the
following claims. For example, the process may be implemented
on a bridge, a separate circuit (discrete or integrated) or in
software, or in combinations of software and firmware or
circuitry. It may be used successfully in other than a PCI 2.2
bus system. Not all parts of the described embodiment need be
implemented to achieve beneficial results.
What is claimed is